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Abstract 

Gene regulatory circuits show significant stochastic fluctuations in their circuit signals due to the low 
copy number of transcription factors. When a gene circuit component is connected to an existing circuit, 
the dynamic properties of the existing circuit can be affected by the connected component. In this 
paper, we investigate modularity in the dynamics of the gene circuit based on stochastic fluctuations in 
the circuit signals. We show that the noise in the output signal of the existing circuit can be affected 
significantly when the output is connected to the input of another circuit component. More specifically, 
the output signal noise can show significantly longer correlations when the two components are connected. 
This equivalcntly means that the noise power spectral density becomes narrower. We define the relative 
change in the correlation time or the spectrum bandwidth by stochastic retroactivity, which is shown 
to be directly related to the retroactivity defined in the deterministic framework by del Vecchio et al. 
This provides an insight on how to measure retroactivity, by investigating stochastic fluctuations in gene 
expression levels, more specifically, by obtaining an autocorrelation function of the fluctuations. We also 
provide an interesting aspect of the frequency response of the circuit. We show that depending on the 
magnitude of operating frequencies, different kinds of signals need to be preferably chosen for circuit 
description in a modular fashion: at low enough frequency, expression level of transcription factor that 
are not bound to their specific promoter region needs to be chosen, and at high enough frequency, that 
of the total transcription factor, both bound and unbound, does. 

Introduction 

In biology, the word modularity has multiple meanings depending on context. It can for example represent 
a set of co-expressed genes [1], a topological unit in a network [2] or a self-contained component such 
as an organ. In this paper we will define a module as a self-contained functional unit whose intrinsic 
properties arc independent of the surrounding milieu. This definition is essentially the same definition 
used in engineering. For example, the intrinsic properties of a TTL NAND gate [3] is unaffected (within 
certain design constraints) when connected to other TTL logic gates. It is the property that makes the 
engineering of complex electronic circuits possible. It allows engineers to design, predict and fabricate 
with a high degree of reliability. The question whether such self-contained and functionally independent 
modules exist at the biological cellular network level is still an ongoing research problem [4]. In this 
article we will be concerned with the design of modular synthetic components. Given the recent rise in 
interest in synthetic biology [5-9], the ability to design and build novel cellular systems in a modular 
fashion is a highly desirable goal. 

In the most abstract sense we can define a module as follows. Given a functional unit M with input I 
and output O, we can define a relation between the input and output as O = M(I). Given two functional 
units, Mi and M2, where the output of Mi serves as the input to M2, then Mi and M2 are defined as 
modules if the relation, O2 = M2(Mi(7i)) is true (Fig. [TJ . This simply means that in connecting Mi and 
M2 together, M2 has no effect on the functional characteristics of Mi and vice versa. 

Del Vecchio ct al. [10] introduced a measure called retroactivity that allowed one to determine the 
influence that a downstream module had on the characteristics of an upstream module. The higher the 
retroactivity the greater the influence from the downstream module and the less modular the functional 
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upstream units. More specifically, they considered a gene circuit module where the input output signals 
were the concentrations of transcription factors, wiring the module to another (see Fig.[T|). ft was shown 
that a module could respond to perturbations much more slowly when the output was wired to the input 
of the next module [10]. The wiring involves binding-unbinding processes between TFs and their specific 
promoter regions. The lifetime of the bound TFs was assumed to be much larger than that of the unbound 
TFs. If the processes, initially at equilibrium, are perturbed in such a way that the copy number of the 
free TF is abruptly increased, a portion of the increased amount will bind to the promoter region if any 
regions are not occupied. This results in a quasi-cquilibrium in the binding-unbinding process [11,12]. 
The promoters act as a reservoir holding the TFs: Whenever there is any change in the number of the free 
TFs, the reservoir quickly buffers the change in the number of TFs [13]. The slow-down in the dynamic 
response in the upstream circuit component was quantified by a retroactivity measure [10]. Del Vecchio 
et al. proposed a mechanism for reducing the retroactivity to achieve modular analysis of the circuit 
dynamics. 
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Figure 1. Gene circuit modules: Input (I) and output (O) signals are TF concentrations. The I/O 
relationships of the modules, Mi and M2, are given by the solutions of ordinary differential equations, 
describing the temporal changes in the concentration levels of the TFs, in the deterministic case. 
Module 2 affects the dynamics of module 1 due to retroactivity on module 1 by module 2, 7*2 1. 

In this manuscript, we focus on an operational method for measuring the retroactivity, by exploiting 
gene expression noise. The gene circuits have been known to show significant stochastic fluctuations in 
their gene expression levels [14-21] (for review articles, [22-25]). Stochastic noise often involves useful 
information that is not available by observing the mean values [26] . The noise in the gene expression can 
be considered as an outcome of continuous perturbations applied to the gene circuits. Thus, the system's 
dynamic properties can be obtained from the noise. In this manuscript, we investigate the measurable 
changes in the noise statistics due to connections between gene circuit components and propose a novel 
practical method to measure retroactivity in vivo to quantify the level of modularity of the dynamics of 
the circuit components. 

We show that the output signal noise can exhibit a significantly longer correlation in the output signal 
when the signal is used as an input of a downstream circuit component. The change in the temporal 
correlation will be quantified by an autocorrelation function [27] (for gene expression studies, [18-21]), 
which indicates the correlation of the signal (X) with itself for a given time-lag (r): 

G x (r) = ((X t+r -{X))(X t -(X))), (1) 

where X t is the signal amplitude at time t and is presumed at a stationary state fluctuating with respect 
to a constant mean. The angle bracket denotes an average over a time series. The observed longer 
correlation in the signal means that the autocorrelation will decrease more slowly with the time-lag, 
which equivalently means that the signal noise has a narrower bandwidth in the frequency spectrum. We 
can quantify such changes in the autocorrelation or the frequency spectrum by introducing a stochas- 
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tic retroactivity. We show that this stochastic retroactivity (s-retroactivity) is directly related to the 
retroactivity investigated in the deterministic framework (d-retroactivity) [10]. 

We also investigate the change in the frequency response of the gene circuit due to the circuit connec- 
tion by observing the autocorrelation functions, more specifically power spectral densities - the Fourier 
transforms of the autocorrelation functions. If circuit signals can be chosen such that they are less af- 
fected by the connection, the chosen signals would be preferred by modelers since the circuit dynamics 
can be analyzed in a modular way. We show that the proper choice is dependent of the magnitude of the 
frequency where the circuits are operated. Our analysis on the power spectral densities show that the 
concentrations of the free TFs are not affected by the wiring at low frequencies (of the order of a day) 
circuit operation, while those of the total TFs, including free and bound, at high frequency (of the order 
of an hour) operation. This implies that depending on the magnitude of the operating frequency, the 
signal output can be chosen either the free or total TFs to describe its dynamics in a modular fashion. 

Autocorrelation functions have been experimentally measured to understand the noise power spectrum 
in gene expression levels of a HIV transcriptional circuit [20] and TetR negative feedback circuit [19]. 
Autocorrelation has also been used to investigate the properties of intrinsic and extrinsic noise [18] and 
to analyze regulatory interactions in a CRP-GalS-GalE feed-forward circuit [21]. Here we offer a new 
application of autocorrelation functions in relation to the analysis of modularity. 



Results 

Retroactivity: Relative increase in the correlation time of output signal 

Let us define stochastic retroactivity based on the change in the autocorrelation function of the output 
signal noise of a module due to the connection of the output to the input of another module. Consider 
the translation and degradation process of a TF protein (X). We presume this process serves as a simple 
module, or a portion of a module processing the module output signal (X). We will compare two cases: 
the module output is wired and not wired (see Fig. [2]). We first consider the case when not wired: No 
wiring means that the TF cannot bind to a TF-specific promoter region, e.g., perhaps the TF specific 
binding regions of promoters are absent. In such cases, all of the TFs are in an unbound state. If the TF 
has a fluorescence tag, the fluorescence intensity will reflect the copy number of the unbound TFs. We 
can model the translation-degradation process of the TF as: 

X ^ 0, (2) 



where a is a translation rate, and "fX a degradation rate. Wc neglect a dilution effect (due to cell growth) 
and other extrinsic noise sources, which however will be taken into account later in this section. 

Consider the above reaction system in the deterministic framework. When the number of X is 
perturbed to increase from the stationary value, the characteristic time to reach the stationary state is 
I/7 [28]. In the stochastic description, stochastic fluctuations in X, deviated from the stationary state 
mean value, will spend a time I/7 typically in reaching the mean value (see Fig. [2]). This means that the 
autocorrelation function decreases with a time constant 1/7, more specifically, the autocorrelation can 
be shown to decrease exponentially: Gx(t) = Gx(0)e n [27]. After applying logarithms on both hand 
sides, we obtain log Gx(t) = logGx(O) — jt. The semi- log plot of Gx(t) becomes a linear line with its 
slope —7, which is the degradation rate constant up to the negative sign. Thus, this implies that the 
deterministic retroactivity (d-retroactivity), quantifying the slow-down in response due to wiring [10], 
can be measured from the slope change. 

We will next wire the output to another module (see Fig. [2]). Wiring means allowing X to bind 
to the promoter region of a downstream module, either by inserting specific promoters for X or by 
adding inducer if the promoter was already present but its activity was suppressed. The system with the 
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Figure 2. Stochastic fluctuations in a fluorescence signal (denoted by Y(t)) and its autocorrelation 
function Gy(r). The signal F(f) fluctuates stochastically with respect to its constant mean level. An 
autocorrelation, indicating the correlation of a signal with itself at a given time difference, shows that 
the correlation typically decreases as the time difference r increases. If the autocorrelation function 
decreases exponentially with t, i.e., Gy(t) = Gy(0)e _ T, the correlation time constant is defined by T. 
This time constant is roughly the same as the time taken for a perturbation in the signal to be relaxed 
to the the mean level. When the signal is used for the input of another circuit component 
(B. Connected Case), the signal Y shows longer correlations in time. The correlation time increases and 
the apparent degradation rate constant decreases. 



connection can be modeled as: 



■yX 



» X — ■+ 0, and I + P/ 



k on XP f 



A, 



(3) 



where Pf and Pb denote free and bound promoters, respectively. The binding-unbinding process can be 
considered much faster in its relaxation to the stationary state than the translation-degradation process 
(k on X + k ff ^> 75 cf. [11,29]) and is assumed to be in quasi-steady states. 

In this subsection, we will focus more on the total number of the TFs rather than the number of the 
free TFs (X). The reason is that the total number can be experimentally observable if the fluorescence 
does not change when the TFs are bound, and we will denote the total number by 

Y = X + P b . 

If the TFs have fluorescence tags, they will bind to the promoter region and the fluorescence intensity 
will reflect the sum of the copy numbers of both bound and unbound TFs. The second reason that we 
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focus more on the sum Y is that it directly reflects the dynamics of the time scale of our interest (of the 
order of cell-doubling time or less). In this time scale, the fast binding-unbinding reactions occur many 
times, and the number fluctuations in the free TFs are rapid, and these fluctuations can be considered 
averaged out in this time scale. Thus it is natural to consider a variable that does not fluctuate due to 
the binding-unbinding process. The total number Y satisfies this property since the total number of the 
TFs does not change while a TF binds to or unbinds from its specific promoter region if a TF is not 
translated and does not degrade yet. Thus, the variable Y can be considered a pure slow mode [10, 12]. 

The slow mode Y can fluctuate only when TFs are translated and degrade. This indicates that the 
translation-degradation process §S§ can be equivalently described (whether the binding-unbinding process 
is fast or not) by 

a iX k on XP f 

— 2 — ► Y and X + P f , P b . (4) 



This process can be simplified further by using the fact that the binding-unbinding events occur much 
more frequently than that of the translation-degradation process. We assume that the fast fluctuations 
in the free TF number are averaged out in the time scale of the translation-degradation process. We 
can consider the degradation rate of Y to be approximately given as jX ~ j(Y — (Pb)y), where (Pb)y 
denotes the average copy number of the bound TFs for a given value of Y, which can be estimated 
from the probability distribution function for P obtained under the assumption of equilibrium in the 
binding-unbinding process (see the Appendix). Thus, the above process for Y can be approximated to 

a „ l(Y-(P b ) Y ) 

► Y > . (5) 



Before we discuss how to define retroactivity in the stochastic regime, we will review the determin- 
istic retroactivity (d-retroactivity) [10]. Consider the process ([5]) in the deterministic framework. The 
degradation rate of Y is expressed as j(Y — P b *), where P b * denotes the number of the bound TFs for 
a given value of Y, estimated under the equilibrium assumption in the binding-unbinding process [10]. 
This degradation rate function was shown to be highly nonlinear in Y [10, 13]. We have plotted the 
graph for the rate function vs. Y in Fig. [3J The slope of the graph at Y e , with Y e the equilibrium 
value of Y, indicates the speed of the relaxation when Y is perturbed from Y e and thus the slope is the 
apparent degradation rate constant (70) ■ The inverse of the degradation rate constant indicates how 
long the relaxation lasts. Since the degradation rate function is highly nonlinear, the derivative changes 
significantly for different values of Y. In [10], it was shown that the derivative was always smaller than 
that of the disconnected case [10]. Thus, the relaxation time (I/70) of Y increases significantly compared 
with the case when disconnected. The d-rctroactivity is defined as the relative change of the apparent 
degradation rate constant of Y [10]: 

R* = 1 — J± = T~~ 7T 5 "' (6) 

i+ttt) 
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Now we return to the stochastic framework and investigate how to define the retroactivity in this 
framework. The stochastic fluctuations in Y can be centered around the nonlinear region of Y (as shown 
in Fig. 2]) such that the discreteness of Y needs to be carefully taken into account. In such a case, 
we cannot make a clear mathematical definition of retroactivity; since the derivative of the degradation 
rate function with respect to Y is not well defined. Instead, we define the retroactivity (R s ) from the 
autocorrelation function of Y: 

R s EE l^ll, (7) 

7 

where 

dlogGytr) . . 

7 a = — - — , for the connected case, and (8) 

dr 
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Figure 3. Rate functions for the translation and degradation processes of the total transcription factor 
Y. The processes are described by ((2]) and ([5]). X e and Y e are concentrations of the transcription factor 
at equilibrium. In the connected case, the degradation rate becomes approximately jY for sufficiently 
large Y and the apparent degradation rate constant 7 a becomes always smaller than 7 [10]. 



The relative change of the slope of a graph of log G(r) vs. r corresponds to the retroactivity. Since the 
value of such defined retroactivity can deviate significantly from that of the d-retroactivity, we call this 
stochastic retroactivity. 

We have estimated the stochastic retroactivity by measuring the slope change in the semi-log plot of 
the autocorrelation functions of Y for the two different processes and . We have used parameter 
values appropriate for degradation tagged TFs in E. coli host cells: the average copy number of the TF 
is set equal to 2 and the dissociation constant of the TF specific promoters between 0.001 and 100 nM 
and the average copy number of plasmids containing the specific promoters to 1 and 100 (for the reaction 
parameter values, refer to the caption of Fig. [6] and [5j. We have set the volume of E. coli roughly equal 
to 1/xm 3 , and for this volume the copy number 1 corresponds to 1 nM. Hereafter we will interchange the 
unit of nM with that of a copy number. We have fitted the autocorrelation to an exponential function. 
The measured s-retroactivity is shown to be well matched with the d-retroactivity (Rd) as shown in 
Fig. [5] and [5j This result implies that the d-retroactivity can be used to estimate retroactivity for the 
case when stochastic fluctuations are significant, e.g., the case of low copy number TFs. Our result shows 
the possibility that the insulation devices, proposed in the deterministic framework [10] for suppressing 
the retroactivity, can be applied to gene circuits, which are highly stochastic. 

Dissociation constants for transcription factors and their own operators in promoter regions range from 
pM to several hundred nM for bacterial transcription factors. For example, TetR proteins bind to the 
operator tetO with a dissociation constant Kd = 0.001 — 15nM (in vitro) [31], lad with Kd = 0.1 — lOpM 
(in vivo) [32], luxR with Kd = 24nM (in vitro) [33], cl with Kd = 120nM (in vivo) [18]. For this range 
of the value of the dissociation constant, the stochastic retroactivity becomes non-negligible as shown 
in Fig. O implying that the autocorrelation in the output signal noise changes significantly due to the 
output wiring: More specifically, the time correlation in the signal noise can persist much longer after 
the output is wired. 

In this estimation procedure for measuring the slopes, one needs to choose fitting regions of r. Based 
on stochastic simulation results, the autocorrelation functions approximately become pure exponential 
functions for the region of small r, between and 100 ~ 250 min, and when the copy number of unbound 
TFs is larger than 1 (see Figs. [6] and [5]). Thus, we have fitted the autocorrelation to an exponential 
function for this range of r and we will justify the reason for choosing the small value region of t later in 
this Result section. 
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Figure 4. The degradation rate of Y, r ){X)y 
The average copy number of the unbound TFs 
number (Y) of the TFs for the reaction process 
and their specific promoters are assumed to be 
value of Y was estimated by using Eq. (fT5]) . and 
performing numerical simulations based on the 
above plot is for the case of = lpM, and Pt 
k off = 0.000167(l/min)]. 
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Retroactivity: Relative decrease in frequency bandwidth of power spectral 
density 

In this subsection, we will make a connection between retroactivity and the frequency response of a 
gene circuit. Retroactivity will be shown to be related to the change in the frequency bandwidth of a 
circuit due to the circuit output wiring. Frequency bandwidth of a circuit is defined in the deterministic 
framework: For example, in the process ([3|), the translation rate a is considered as a time- varying input 
signal (e.g., a sinusoidal function), and one observes the output signal of the circuit to estimate the 
attenuation (response) of the signal for different magnitude of frequencies. Here in the gene circuit, the 
stochastic fluctuations in gene expression levels are significant, and we question "Can one exploit the 
stochastic fluctuations to understand the frequency response of the circuit?" Simpson, ct al. performed 
the frequency domain analysis of noise in a negatively-autorcgulatcd gene circuit and showed that the 
noise frequency spectrum (power spectral density) is significantly dependent on the negative feedback 
loop structure [19,34]. Mathematically, they showed the dependence in terms of the transfer function of 
the feedback loop. This means that the transfer functions, derived in the deterministic framework, can 
be, at least partly, identified by investigating the noise frequency spectrum. Therefore, we will investigate 
the noise frequency spectrum to investigate the change in frequency response due to the circuit wiring. 

We perform the Fourier transform of the autocorrelation function of Y, resulting in a (two-sided) 
power spectral density. If the estimated autocorrelation function of Y can be well approximated to a 
pure exponential given by Wi(2^f a )~ 1 cxp(— 7 a |i~|) (in the previous section, we assume r > 0, but here r 
can be negative), its Fourier transform is given as 

1 Wj 

GvH^-^-^. (10) 

This power spectral density (PSD) decreases by half from its maximum when to is equal to the apparent 
degradation rate constant, j a , and thus this value is called the bandwidth of the power spectrum of Y. 
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Figure 5. Retroactivity for the deterministic and stochastic cases for the different values of Kd and the 
number of free TFs, (X). The unit of Kd is nM. The lines in the above plots correspond to 
d-retroactivities (Eq. ([5])) and the points to s- retroactivities. Circles indicates the cases that the 
autocorrelation functions of Y cannot be approximated to pure exponential functions as shown in Fig. [5] 
and the s-retroactivities for these cases were estimated by choosing fitting regions of t between 0, and 
25 ~ 50 min. For the rest of the cases, the autocorrelations can be approximated to pure exponential 
functions for the wider fitting regions of r between and 100 ~ 250 min. 



This indicates that the bandwidth decreases from 7 to j a = 7(1 — R s ) due to retroactivity when two 
modules are connected (see Fig. [7j\ and B). The relative decrease in the bandwidth can also be used to 
define the retroactivity. 

Finally, we discuss how to reduce the change in the frequency bandwidth, i.e., retroactivity. Del 
Vecchio et al. proposed two mechanisms for this: a negative feedback on X and the amplification in the 
degradation rate of X [10]. More specifically, the negative feedback increases the apparent degradation 
rate constant since any perturbation in X will damp away more strongly due to the feedback, and 
the amplification of the degradation rate directly increase the degradation rate constant. Thus, these 
mechanisms can reduce the retroactivity. 

Modularity in circuit dynamics and two choices of output signals: X or Y? 

Figure [7] shows very interesting aspects of the frequency responses in the circuit. The noise spectrum of 
signal Y (the total number of the transcription factor) changes significantly at low enough frequency while 
it does not at high enough frequency, but that of signal X (the number of the unbound transcription 
factor) shows the opposite behavior. This implies that the signal X becomes not affected by wiring 
when the circuit is operated at the low frequency, and the signal Y does not when operated at the high 
frequency, although the retroactivity is still causing the significant change in the frequency bandwidth. 
This means that circuit dynamics can be investigated in a modular way by choosing an appropriate 
output signal, cither X or Y, depending on the magnitude of the operating frequency 
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Figure 6. Autocorrelation functions of Y and A, when a downstream module is either connected or 
disconnected: The slopes of the semi-log plot of the autocorrelation functions correspond to 
7 a = 7(1 — B, s ) (Eq. ©) up to a negative sign. The lines labeled 'Gillespie' corresponds to stochastic 
simulation results based on the Gillespie stochastic simulation algorithm [30] . The lines labeled 
'Deterministic' are exponential functions: C exp [ — 7(1 — -Rd)r] with Rd the corresponding deterministic 
retroactivities Eq. ^ and C a normalization constant, chosen for the overlap with the 'Gillespie'. The 
autocorrelation functions of A is not exponential (B). The sudden drop for the small value of r ~ [0, 5] 
is originated from fast binding-unbinding processes. This implies that the slow mode Y of dynamics 
gives dominant contributions to the autocorrelation for the larger value of r. The following parameter 
values are used: K d = O.lnM, (X) = a/7 = 2nM, and P t = lOOnM [a = 0.0334(~ 1/30) 11M min" 1 , 
7 = 0.0167 (~ 1/60) min" 1 , k on = 0.167 (~ 1/6) nM^min^ 1 [17], and k off = 0.0167(~ 1/60) min -1 ]. 

To validate the above prediction, wc simulate the process ([3]) deterministically, with a allowed to 
oscillate at different frequencies, which are chosen from three different regions indicated in Fig. [7j 

a(t) = 1 + 0.1 sin(27r/i) (11) 

with / = 0.001/27T, 0.01/27T, and 0.05/27T (see Fig. [HJ- Output signals X and Y show oscillations with 
basal levels: X(t) = X + SX(t) or Y(t) = Y + SY(t). We compare the oscillatory components (SX 
and SY) of the output signals for the two different cases when the output is connected and disconnected 
(Fig. |SJ). At the low frequency / = 0.001/27T, 5X does not change due to wiring while SY does. At the 
high frequency / = 0.05/27T, SX changes due to wiring while SY does not. At the intermediate frequency 
/ = 0. 01/27T, both the signals X and Y change. This verifies the prediction based on power spectral 
densities, and implies that the modularity of gene circuits can be tested by observing gene expression 
fluctuations. 

Now, we will explain in detail the origin of this different behavior of X and Y. First, we investigate 
the case that the circuit operates in the low frequency region. The mean level of A, corresponding 
to zero frequency component, is determined by its synthesis and degradation rate because binding and 
unbinding reactions are balanced ((A) = a/7). The same argument is applied for the low frequency 
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Figure 7. Power spectral densities (PSD) (one-sided) obtained from the autocorrelation functions of Y 
and X, respectively shown in Fig.[6j\- and B. The low frequency components of signal X are not 
affected by the connection to a downstream component (B), while the high frequency components of Y 
are not (A). The threshold point, where the PSD starts to drop down, is shifted to the left for both X 
and Y, due to the connection. Three regions of frequency colored above correspond, respectively from 
left to right, to low frequency region where the signal X is not affected while Y is, intermediate region 
where both X and Y are affected, and high frequency region where Y is not affected while X is. 



region (w) colored in dark gray in Fig. [3 where the balance still holds, and the two PSD of X when 
connected and disconnected are almost the same as each other. This means that the low frequency 
components of signal X is not distorted by the wiring. Therefore, when the dynamics of the individual 
modules are described by X, the signal distortion in X due to the retroactivity can be minimized by 
operating at the low frequency. 

We consider the case that the circuit operates in the high frequency region. In this case, we investigate 
the signal Y rather than X, because the two PSDs of Y when connected and disconnected converges to 
each other at the high frequencies as shown in Fig. [7] More specifically, the PSD of Y becomes Wi/2ituj 2 
as lo ^> 7 > 7a by using Eq. f) 1 0[) . This implies that when the dynamics of the modules are described by 
Y, the signal distortion in Y due to the retroactivity can be minimized when the circuit is operated at 
the high frequency. We note that the PSD of Y at the low frequencies changes due to the wiring because 
there is an extra amount of bound TFs when wired. This PSD study implies that modular description is 
possible by choosing the right signal variables either X or Y depending on the operating frequencies. 
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Figure 8. Signals X and Y are compared for different values of frequency when process ([3]) is 
dctcrministically simulated. The translation rate a is given as a time- varying signal with a harmonic 
oscillation as shown in Eq. (TlTj) . The frequency of the oscillation is chosen from the three regions shown 
in Fig. [7J The parameter values are identical to those used in Fig. [51 



Non-linear degradation rate of Y 

Non-linear degradation rate of Y causes the autocorrelation function to be non-exponential. First, we 
discuss why the degradation rate of Y becomes non-linear. We have assumed that the degradation 
of bound TFs is negligible compared to that of unbound TFs. When the copy number of the total 
TFs is much larger than that of the specific promoters, most of the TFs are found to be unbound, 
and mathematically this is expressed as Y ~ X. Thus, the degradation rate of Y (= jX) becomes 
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x (min) 

Figure 9. Semi-log plot of normalized autocorrelation functions with non-exponential shapes. 
Parameters: K d = O.lnM, (X) = a/7 = O.lnM, and P t = InM [a = 0.00167nM/min, 
7 = 0.0167(l/min), k on = 0.167(l/nM/min), k ff = 0.0167(l/min)]. For the explanations on the 
legend, refer to Fig. \§\ 

approximately proportional to Y: ~ jY. However, if the copy number of the total TFs is comparable 
to that of the specific promoters, most of the TFs become bound to the promoter regions. Since these 
bound TFs degrade very slowly, the net degradation rate of the total TFs becomes significantly reduced 
(Fig. 2]). Thus, the degradation rate of Y is not proportional to Y. 

Such nonlinearity causes the autocorrelation function of Y not to be a pure exponential function 
as shown in Fig. [9] (while the linear degradation rate causes the autocorrelation function to be a pure 
exponential function as we have discussed previously). The reason of the non-exponential autocorrelation 
is that the fluctuations in Y are composed of many modes of fluctuations with different correlation 
time constants. The slower modes among them can persist its correlation for the longer time interval, 
contributing more dominantly to an autocorrelation function for the region of large r. Consider the case 
of the nonlinear degradation shown in Fig. [4] Most of the fluctuations occur between 99 and 102. Among 
them, the fluctuations between 100 and 102 will have the correlation time constant ~ 1/7, while those 
between 99 and 100 will have the correlation time constant much larger than I/7. Thus, the slope of 
the autocorrelation (in the semi-log plot) becomes tilted upward for large values of r > 200 as shown 
in Fig. [5] This slope change, which typically appears in our simulations, becomes negligible for the case 
that (X) > 1 for the time interval of less than several hours (see Fig. [5J. 

Fitting regions of X for s-retroactivity measurements 

The remaining question is to be 'why is the slope in the small t region (e.g., [0,50] in Fig. [9|) related 
to the d-retroactivity?' To answer this question, we need to discuss how the deterministic retroactivity 
can be estimated. In [10], the retroactivity was measured (in simulations) either by applying static or 
oscillatory perturbations in the TF creation rate. The strength of the perturbations needs to be small 
enough that the degradation rate of the total TFs changes in proportion to the perturbation strength. 
To make a correspondence to the deterministic case, in the stochastic framework, we have to consider 
small-amplitude fluctuation components of Y to estimate the retroactivity If a system is not multi-stable, 
small-amplitude fluctuations occur more frequently than large-amplitude fluctuations. To measure the 
contribution of the dominant fluctuations (which are the small-amplitude ones), we need to consider 
the autocorrelation for the sufficiently small value of t; for the small value of r, all the fluctuation 
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components, with relaxations either slow or fast, can contribute to the autocorrelation and thus the 
major contribution comes from the fluctuation components occurring most frequently, which are the 
small-amplitude fluctuations. This is why we have chosen the region of the small value of r for the 
estimation of the retroactivity. By choosing the fitting region of r sufficiently small enough (between 
and 25~50 min), the s-retroactivity is shown to be well matched with the d- retroactivity for the reasonable 
parameter values for experiments. 

Computational approach for estimating modularity from gene expression fluc- 
tuations 

In this subsection, we show how to computationally estimate the level of modularity based on fluorescence 
data from single cell experiments. Synthetic circuits are often transferred to host cells, e.g., E. coli, which 
duplicate themselves by completing cell cycles. This causes the intra-cellular concentrations to fluctuate. 
In addition, the host cells are under other unidentified extrinsic noise sources. Such extrinsic noise 
has been shown to affect the autocorrelation functions [18-21] and needs to be taken into account for 
estimating the level of modularity. 

If a transcription factor with a fluorescence marker is tagged for degradation, the lifetime of the TF 
can be comparable to the cell doubling time (Td). Then, the autocorrelation function of the fluorescence 
emitted from the TF can be fitted to the following function: 

Gy(r) = Ae- &ET + Be- {SE+ ^ )T 7 (12) 

where Se = log(2)/Td with Td the cell doubling time and j a the apparent degradation rate constant, 
defined in Eq. Q. The above form of the autocorrelation has been investigated in its Fourier transform 
(power spectral density) by Austin et al. by using a plasmid containing a GFP variant with its half-life 
reduced [19]. They did not consider the retroactivity dependence in j a . The power spectral density, a 
Fourier transform of the autocorrelation function Eq. (fT2"|) , is obtained as [19]: 

ri \ - 2Me 2B(5 E + ~f a ) 

Depending on the relative magnitude of A and B, the bandwidth is determined by either Se and Se + 7a 
or both. 

We propose the following data analysis procedures, to estimate retroactivity: 

1. Obtain single cell fluorescence trajectories by connecting each trajectory belonging to a cell lineage, 
by following the method described in [19]. 

2. Estimate the autocorrelation function of the fluorescence. 

3. Perform a non- linear fit by using Eq. (|12[) to estimate Se and "f a - Alternatively, estimate Se = 
log(2)/T(Z by measuring cell doubling time (Td) from the cell lineage, and then perform a non-linear 
fit to estimate j a by using Eq. (fT2"|) . 

5. Estimate the retroactivity: Repeat the above procedures in the connected and disconnected case 
and obtain the relative change in the apparent degradation rate constants by using Eq. |(7J). 

We can identify the frequency regions that the circuits operate in a modular fashion: 

1 . Obtain power spectral densities by applying the Fourier transformations on the estimated autocor- 
relation functions. 
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2. Identify the region of frequency where the two PSDs obtained in both the connected and discon- 
nected cases overlap each other. This is the region of frequency where signal Y is preferably chosen 
for the modular analysis. 

3. Identify the region of frequency where the two PSDs of Y becomes parallel to each other. This is 
the region of frequency where signal X is chosen. 

We present a model for gene expression in the Appendix where we derive the above autocorrelation 
function Eq. (|12p , which will be shown to be consistent with the current experimental measurements 
[18-21]. The model process is described in the Langevin dynamics framework, with the retroactivity 
effect taken into account. In the derivation of Eq. (p~2|) . we neglect the reaction processes of fast relaxation 
(< 10 min) due to transcription, and mRNA-degradation [19]. 

If the copy number of unbound TFs is very low (X) < 1, the autocorrelation may not be a pure 
exponential due to both the extrinsic noise and the nonlinearity of the degradation rate of Y. In this 
case, the fitting region becomes comparable to the fast relaxation time scale and we need to take into 
account the contribution of the fast time scale dynamics to the autocorrelation functions (shown in 
Eq. (|14|)). The fast time scale dynamics is, however, not well understood compared to the slow time scale 
dynamics [18-21]. In addition, taking into account the fast time scale dynamics increases the number of 
fitting parameters. Therefore, an accurate estimation can be less promising in the case for the low copy 
number of the unbound TF. 

If the degradation process of bound TFs is not negligible, retroactivity be- 
comes reduced. 

In this subsection, we take into account the degradation of bound transcription factors. The total copy 
number of the transcription factors (Y) is still a pure slow mode because the degradation of the bound 
transcription factors may not be faster than the unbound ones. The degradation of the bound TFs (Pb) 
contributes to the degradation of Y: 

► Y ►, i.e. > i > 

with 7f, the degradation rate constant of the bound TFs and for the reaction process shown on the right, we 
used X = Y — Pb. When the bound TFs degrades at the same speed as the unbound TFs, the degradation 
rate becomes proportional to Y with its rate constant equal to 7, resulting in zero retroactivity. For the 
case of 7 7^ 7b, however, the apparent degradation rate constant becomes 

7a = 7(1 - (1 - lbh)R), 

where R is the retroactivity in the case that the degradation of the bound TFs is negligible (Eq. © or 
©). Thus, the retroactivity is expected to decrease by a factor of 1 — 7b/7 when the degradation of the 
bound TFs are allowed. 

Summary 

In this paper, we have considered a gene transcription-translation module and have investigated how noise 
correlations in the circuit signals change after the output of the circuit module is wired to another module. 
We call such changes in noise correlation, stochastic retroactivity. More specifically, the retroactivity was 
defined from the autocorrelation functions of gene expression levels of transcription factors (TFs). When 
the TFs can bind specifically to downstream promoters, the autocorrelation is shown to decrease more 
slowly compared with the case when not wired. This implies that the noise correlation persists longer 
and that the module filters out signal components corresponding to high frequencies more strongly. 
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We have made a connection between the stochastic retroactivity and the retroactivity investigated 
in the deterministic framework [10]. For reasonable parameter values for experiments, our simulation 
study shows that the autocorrelation functions approximately become pure exponential functions (if we 
neglect extrinsic noise) and the estimated stochastic retroactivity is well matched with the deterministic 
retroactivity. 

We have shown that the retroactivity is related to the amount of decrease of the frequency bandwidth 
of the power spectral density of the gene expression. We have proposed that depending on the magnitude 
of an operating frequency, the circuit dynamics can be analyzed in a modular fashion by choosing the 
appropriate circuit output signal: at low enough frequencies, the concentration of the free TF, and for 
high enough frequencies, the concentrations of the total TF. 

This autocorrelation study provides a experimental method for measuring the retroactivity to test 
the level of modularity in gene circuit dynamics. This will eventually help for the modular design of the 
gene circuits with elaborate dynamical functionality. 



Materials and Methods 

Gene expression models and autocorrelation functions 

In this section, we present a stochastic model of gene expression for transcription factors (TFs), taking into 
account retroactivity due to binding-unbinding interactions between the TFs and their specific promoter 
region. We aim to describe the dynamics of the processes in the slow time scale, where we assume the 
binding-unbinding processes of the TFs to be in equilibrium. In the slow time scale, the dynamics can 
be described by a slow variable, the total number of TFs (Y), including bound and unbound TFs. We 
assume that the translation of the TF occurs at a constant rate and the degradation at a rate proportional 
to Y with a apparent degradation rate constant, *y a . The reason for the latter assumption is based on the 
simulation result that its autocorrelation can be approximated to be a pure exponential function for the 
case that the average number of the unbound TFs is larger than one. Then, we can model the process 
by the following Langevin equation: 

dY 

— = a - 1 'Y + I + E 7 (13) 
where 7' is the sum of the apparent degradation rate constant (7 Q ) and dilution rate constant (Se)' 

7' = 7a + S E - 

I and E represent intrinsic and extrinsic noise, respectively, which are exponentially-correlated [19-21]: 

(I{t)I(t + T)) = Wiexp{-r/Ti), 

and 

(E(t)E(t + r)> = W E exp(-r/T E ), 

where the angle bracket denotes an average over a time series at the stationary state. 

To obtain the autocorrelation function of Y, we obtain the integral- type solution of Eq. (fl3|) for Y(t) 
and substitute the solution into Eq. ([T]), resulting in: 
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We will show that the above autocorrelation function, Eq. (fTi)) . is consistent with the functional 
forms obtained or used in [18,19,21]. To show the consistency with Austin et al. [19], we consider 
that the internal noise correlation time is much smaller than the extrinsic noise correlation time (i.e., 
8 1 3> Se)- We can simplify the above equation to Eq. (fT2"|) . giving an identical result presented in 
[19] up to the normalization constant. In [18,21], the authors did not use degradation-tagged TFs 
in measuring the expression level. The life time of the TFs was much larger than the extrinsic noise 
correlation time (Se S> 7 ^ la, i-c, 7' — Se)- In this case, the autocorrelation can be simplified to 
A' exp(— Sit) + B' exp(— Set) (where internal noise correlation is not ignored) [18,21]. Therefore, this 
consistency supports the Langevin dynamics description Eq. Q13p for gene expression, taken into account 
the retroactivity, are valid within the accuracy of current experimental data. 



Slow mode (Y) dynamics 

In this section, we justify the reaction process of Y , described by Eq. ([3]), and derive the average copy 
number of the bound transcription factor, (P )y- The TF degradation is assumed to be much slower 
than the equilibration time of the binding-unbinding process between the TF and its specific promoter 
regions: k on X + k off » 7, cf. [10,11,29]. 

We construct the master equation for Y and P b : 

+P(Y, P b - l)k on (P t -P b + 1)(Y -P b + 1) 

+P(Y + 1, P b )j(Y + 1-P b ) + P(Y, P b + l)k off (P b + 1) 

-P(Y, P b ) [a + jX + k off P b + k on (P t - P b )X] . 

As in the deterministic case, we assume the transcription factor binding-unbinding process is in a quasi- 
cquilibrium state, conditioned on the slow variable, i.e., P(Y,P b ;t) = P(P b \Y)P(Y;t) [22]. This approx- 
imation simplifies the above equation to another master equation for the slow variable Y: 

^^.=a[P(Y-l;t)-P(Y;t)] 

+j(Y + 1 - (P b ) Y )P(Y + l;t)-(Y- (P b ) Y )P(Y; t). 

This means that the slow process can be simplified to the reaction process described by Eq. ([5]) . 
Under the equilibrium approximation, the fast mode process satisfies the detailed balance: 

koff(Pb + 1) P(Pb + l\Y) = k on (P t - P b )(Y - P b ) P(P b \Y). 

This enables one to obtain the equilibrium probability distribution function of P b . We obtain the average 
copy number of the bound transcription factor, (P )y'- 



p» PAY\ 



k of fJ (p b - iy.(p t - p b y.(Y - p_ 



x P(0\Y), (15) 
where P t denotes the total number of the promoters, Pf + P b . 
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